Zuxuan Wu

Fudan University

CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping

May 29, 2026

VLA-Pro: Cross-Task Procedural Memory Transfer for Vision-Language-Action Models

May 28, 2026

Compositional Text-to-Image Generation Via Region-aware Bimodal Direct Preference Optimization

May 27, 2026

Channel-wise Vector Quantization

May 25, 2026

Baton: Explicit Semantic Blueprints for Joint Video-Audio Generation

May 24, 2026

DecQ: Detail-Condensing Queries for Enhanced Reconstruction and Generation in Representation Autoencoders

May 21, 2026

Resolving Representation Ambiguity in Feedforward Novel View Synthesis Transformer via Semantic-Spatial Decoupling

May 18, 2026

DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models

May 14, 2026

GuidedVLA: Specifying Task-Relevant Factors via Plug-and-Play Action Attention Specialization

May 12, 2026

Attention Itself Could Retrieve. RetrieveVGGT: Training-Free Long Context Streaming 3D Reconstruction via Query-Key Similarity Retrieval

May 10, 2026